Poster: Face_ Date: Sep 29, 2008 11:41am
Forum: web Subject: Re: Help Needed Quick

Unfortunately, the Wayback Machine is either unable or unwilling to archive YouTube videos (I suspect unwilling, because that would be a LOT of data). But even if it were, it wouldn't store YouTube user pages. YouTube itself does not permit this and, to my knowledge, never did. See: www.youtube.com/robots.txt. Are you sure that they allowed it for some time?

Another future TODO of medium priority: a few checks when submitting URLs through "Archive That", such as checking whether there's actually something at the given location and whether archiving is permitted. Currently, it doesn't check anything.
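For what it's worth, here is a rough sketch of what such pre-submission checks could look like, using only the Python standard library. Everything named here (precheck, url_exists, archiving_permitted, the "ExampleArchiveBot" user agent) is made up for illustration and is not how "Archive That" actually works:

```python
from urllib.parse import urlsplit
from urllib.request import Request, urlopen
from urllib.robotparser import RobotFileParser


def url_exists(url, timeout=10):
    """Return True if a HEAD request to the URL succeeds (hypothetical helper)."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (OSError, ValueError):
        # HTTP errors, DNS failures and timeouts are all OSError subclasses.
        return False


def archiving_permitted(url, agent="ExampleArchiveBot"):
    """Return True if the site's robots.txt lets `agent` fetch the URL."""
    parts = urlsplit(url)
    parser = RobotFileParser(f"{parts.scheme}://{parts.netloc}/robots.txt")
    try:
        parser.read()
    except OSError:
        # Assumption: if robots.txt cannot be fetched at all, refuse to archive.
        return False
    return parser.can_fetch(agent, url)


def precheck(url, exists=url_exists, permitted=archiving_permitted):
    """Run the checks in order and return "ok" or a rejection reason."""
    if urlsplit(url).scheme not in ("http", "https"):
        return "rejected: not an http(s) URL"
    if not exists(url):
        return "rejected: nothing found at that location"
    if not permitted(url):
        return "rejected: robots.txt disallows archiving"
    return "ok"
```

The two checks are passed in as parameters so they can be swapped out or tested without touching the network. Whether a missing robots.txt should count as "permitted" is a policy decision; the sketch leaves that to RobotFileParser's default behaviour.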

Poster: Classic_TV_and_Radio_Fan Date: Sep 29, 2008 12:21pm
Forum: web Subject: Re: Help Needed Quick

But WebCite archives it fine, and it doesn't allow robots.txt-blocked pages to be archived.

Poster: Face_ Date: Sep 30, 2008 5:43am
Forum: web Subject: Re: Help Needed Quick

The WebCite bot doesn't care about robots.txt files. A bot is not obliged to follow them, after all. It only does so when it's programmed to.

I found out what might be the source of your confusion. YouTube's robots.txt disallows bots from accessing "www.youtube.com/user/{username}", but it allows them to access "www.youtube.com/{username}". In other words, this is allowed:

http://au.youtube.com/iLoveClassicTV

But this is not:

http://au.youtube.com/user/iLoveClassicTV

It's something of a 'security flaw'. Not that it makes much difference for you, though. Neither the Wayback Machine nor WebCite actually stores YouTube videos, so using them to archive YouTube is pointless.
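If you want to see the /user/ path difference for yourself, Python's standard urllib.robotparser can evaluate robots.txt rules. A minimal sketch follows. Note that the rules below are a made-up stand-in that only reproduces the behaviour described above, not a copy of YouTube's actual robots.txt, and "ExampleBot" is a hypothetical user agent:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules modelled on the behaviour described in this post;
# NOT the real contents of www.youtube.com/robots.txt.
RULES = """\
User-agent: *
Disallow: /user/
"""


def is_allowed(url, rules=RULES, agent="ExampleBot"):
    """Return True if a rule-following bot may fetch the URL under `rules`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


print(is_allowed("http://au.youtube.com/iLoveClassicTV"))       # True
print(is_allowed("http://au.youtube.com/user/iLoveClassicTV"))  # False
```

Rules are matched as plain path prefixes, which is why "/user/iLoveClassicTV" is blocked while "/iLoveClassicTV" slips through. And of course this only matters for bots that bother to check.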